Genetic association of ANRIL with susceptibility to Ischemic stroke: A comprehensive meta-analysis

Background Ischemic stroke (IS) is a complex polygenic disease with a strong genetic background. The relationship between the ANRIL (antisense non-coding RNA in the INK4 locus) in chromosome 9p21 region and IS has been reported across populations worldwide; however, these studies have yielded inconsistent results. The aim of this study is to clarify the types of single-nucleotide polymorphisms on the ANRIL locus associated with susceptibility to IS using meta-analysis and comprehensively assess the strength of the association. Methods Relevant studies were identified by comprehensive and systematic literature searches. The quality of each study was assessed using the Newcastle-Ottawa Scale. Allele and genotype frequencies were extracted from each of the included studies. Odds ratios with corresponding 95% confidence intervals of combined analyses were calculated under three genetic models (allele frequency comparison, dominant model, and recessive model) using a random-effects or fixed-effects model. Heterogeneity was tested using the chi-square test based on the Cochran Q statistic and I2 metric, and subgroup analyses and a meta-regression model were used to explore sources of heterogeneity. The correction for multiple testing used the false discovery rate method proposed by Benjamini and Hochberg. The assessment of publication bias employed funnel plots and Egger’s test. Results We identified 25 studies (15 SNPs, involving a total of 11,527 cases and 12,216 controls maximum) and performed a meta-analysis. Eight SNPs (rs10757274, rs10757278, rs2383206, rs1333040, rs1333049, rs1537378, rs4977574, and rs1004638) in ANRIL were significantly associated with IS risk. Six of these SNPs (rs10757274, rs10757278, rs2383206, rs1333040, rs1537378, and rs4977574) had a significant relationship to the large artery atherosclerosis subtype of IS. 
Two SNPs (rs2383206 and rs4977574) were associated with IS mainly in Asians, and three SNPs (rs10757274, rs1333040, and rs1333049) were associated with susceptibility to IS mainly in Caucasians. Sensitivity analyses confirmed the reliability of the original results. Ethnicity and individual studies may be the main sources of heterogeneity in ANRIL. Conclusions Our results suggest that some single-nucleotide polymorphisms on the ANRIL locus may be associated with IS risk. Future studies with larger sample numbers are necessary to confirm this result. Additional functional analyses of causal effects of these polymorphisms on IS subtypes are also essential.


Introduction
Stroke is the second leading cause of death in the world [1] and the first leading cause of death in China [2]. In 2017, the National Epidemiological Survey of Stroke in China (NESS-China) from 31 provinces reported that the incidence and mortality rates of stroke were 246.8 and 114.8 per 100,000 person-years, respectively, and it is estimated that about 3.4 million new stroke cases occur each year [3]. Stroke warrants some of the highest medical costs in China, costing nearly 75.6 billion yuan (RMB) in direct medical costs [4]. Hospitalization expenses are projected to increase significantly with the expected improvement in people's living standards [5]. Ischemic stroke (IS) accounted for 43.7%-78.9% of all stroke cases in China [6]. IS is a complex disorder with a strong genetic component [7]. Thrombosis of brain arteries secondary to atherosclerosis is considered one of the major pathophysiological mechanisms of IS [8]. Thus, studies into genetic susceptibility to atherosclerosis have attracted a lot of attention.
ANRIL (antisense non-coding RNA in the INK4 locus), which belongs to the long non-coding RNA family, was found to have a strong association with the risk for cardio-metabolic diseases [9], playing a key role in atherosclerotic diseases such as IS. A number of studies have explored the relationship between ANRIL and IS across populations worldwide. However, most of these studies used small sample sizes and the findings were inconclusive. Data from linkage and association studies showed that susceptibility loci for common diseases had only minimal effects. Meta-analysis is a powerful tool that allows the detection and validation of minimal biological effects in human genetic association studies [10]. Researchers have investigated the role of a few single-nucleotide polymorphisms (SNPs) on the ANRIL locus in IS across different populations by meta-analysis. However, the association of other genetic variants and other SNPs in ANRIL with IS deserves further analysis. In addition, some recently published studies across ethnicities were found in the literature search. In this study, we conducted an updated meta-analysis on all available association study data to comprehensively evaluate the contribution of ANRIL to the risk of IS.

Study design
This research was conducted according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analysis) statement and the guidelines presented in Systematic Reviews of Genetic Association Studies by Sagoo et al. [10]. ANRIL polymorphism was used as the exposure and IS as the outcome. This work did not require the approval of an ethics committee and was not registered in any database. The completed PRISMA checklist and Meta-analysis on Genetic Association Studies Checklist are given in S1 and S2 Appendices.

Data collection
All studies involving the relationship between ANRIL gene polymorphisms and stroke were identified independently by three investigators (Bai N, Liu W, and Zhou Q) by searching the following databases until August 2021: PubMed (from 1966), EMBASE (from 1966), the Cochrane Library (from 2003), ProQuest Dissertations & Theses Database (from 1980), Biosis Preview (from 1990), Web of Science (from 1990), China National Knowledge Infrastructure (CNKI, from 194), and Wanfang Database (including journal articles, dissertations or theses, and conferences literature, from 1990). We used the following keywords or their combinations in search strategies: "ANRIL", "CDKN2B-AS1", "antisense non-coding RNA in the INK4 locus", or "9p21" and "stroke", "cerebral infarction", or "cerebrovascular disease". We limited the search to only human studies. Examples of the keywords search strategy in PubMed are: ("ANRIL"[All Fields] OR "CDKN2B-AS1"[All Fields] OR "antisense non-coding RNA in the INK4 locus"[All Fields] OR "9p21"[All Fields]) AND ("stroke"[All Fields] OR "cerebral infarction"[All Fields] OR "cerebrovascular disease"[All Fields]).
The references listed in the retrieved articles and in review articles as well as abstracts from recent conferences were also searched for possible eligible studies. Only the most recent or complete reports were selected for analysis if the same or a similar patient cohort was included in several publications. There were no restrictions on the source of the control group, and studies in which the control groups were not in Hardy-Weinberg equilibrium were excluded [11].
Studies meeting the following criteria were included for meta-analysis: 1) genetic association studies of the ANRIL polymorphisms with IS were performed using a population (hospital)-based, case-control, nested case-control, or cohort design; 2) IS was diagnosed using a standard that has been widely accepted; 3) control subjects were unrelated individuals, with no symptomatic vascular disease as confirmed by physicians; 4) genotype or allele frequencies were reported in both patients with IS and in controls or could be calculated successfully; and 5) a genetic variant of ANRIL was included in at least two of the studies. Case-only studies, family-based studies, and review articles were excluded. The quality of included studies was assessed based on the published study [12] and the Newcastle-Ottawa Scale (NOS) [13]. A NOS score ≥7 was considered high quality [13].

Data extraction
Data were carefully extracted from all eligible studies independently by two authors (Liu W, Xiang T), and any disagreements were resolved by discussion. The following information was extracted: first author's surname, year of publication, country of origin, study design, sex composition of the case and control groups, ethnicity of the subjects studied, total number of subjects, definition and characteristics of cases and controls, genetic variants associated with IS, genotyping methods, distribution of genotypes and alleles, IS subtype (if reported), information on additional genetic variants, as well as gene-gene and gene-environment interactions (if investigated). Genotype frequencies were calculated where possible.
For studies that included subjects from different ethnic groups, data were extracted separately for each ethnic group. When some of the information was not available, we contacted the corresponding author by email for additional information.

Statistical analyses
Odds ratios (ORs) and pooled ORs with corresponding 95% confidence intervals (CIs) were calculated using the fixed-effects or random-effects model. For the chi-square test based on Cochran Q statistic, p-values <0.10 were considered to be statistically significant [14]. The I² metric was used to evaluate the heterogeneity among studies [15].
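The pooling and heterogeneity statistics described in this paragraph can be illustrated with a minimal sketch. It assumes inverse-variance weighting for the fixed-effects estimate and the DerSimonian-Laird estimator for the random-effects estimate; the exact estimators configured in RevMan and STATA for this study are not stated in the text, and the function names `log_odds_ratio` and `pool_studies` are hypothetical.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Return (log OR, standard error) for one 2x2 table.

    a, b: cases with / without the risk category
    c, d: controls with / without the risk category
    All four cells are assumed to be non-zero.
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se


def pool_studies(tables, z=1.96):
    """Pool 2x2 tables (assumed estimators: inverse variance, DerSimonian-Laird)."""
    effects = [log_odds_ratio(*t) for t in tables]
    y = [e for e, _ in effects]
    w = [1 / se ** 2 for _, se in effects]

    # Fixed-effects estimate: weighted mean of the log odds ratios.
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w

    # Cochran Q and I^2 (share of variation attributed to heterogeneity).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    df = len(y) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian-Laird between-study variance, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1 / (1 / wi + tau2) for wi in w]
    random_est = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se_re = math.sqrt(1 / sum(w_re))

    return {
        "or_fixed": math.exp(fixed),
        "or_random": math.exp(random_est),
        "ci_random": (math.exp(random_est - z * se_re),
                      math.exp(random_est + z * se_re)),
        "q": q,
        "df": df,
        "i2": i2,
        "tau2": tau2,
    }


# Hypothetical counts for three studies, for illustration only.
example = pool_studies([(120, 380, 100, 400), (60, 140, 50, 150), (200, 300, 150, 350)])
print(example)
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the fixed-effects weights, so the two pooled ORs coincide.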
Hardy-Weinberg equilibrium was tested in the control groups using the chi-square test. Three genetic models were used to examine the association of ANRIL polymorphisms and risk of IS: (1) allele contrast (AC) (effect of each additional risk allele), (2) dominant model (DM), and (3) recessive model (RM). Multiple testing correction was conducted using the false discovery rate (FDR) method proposed by Benjamini and Hochberg. Inverted funnel plots and Egger's test were performed to detect publication bias in the analyses involving different genetic variants. Publication bias was considered to be present if the inverted funnel plot was asymmetric and/or Egger's test result was significant (p <0.10).
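As an illustration of the steps in this paragraph, the sketch below tests Hardy-Weinberg equilibrium with a 1-df chi-square test, collapses genotype counts into 2x2 tables for the three genetic models, and applies the Benjamini-Hochberg adjustment. The model definitions follow the conventional ones (for example, the recessive model compares the risk homozygote with the other two genotypes); the function names and all counts are hypothetical.

```python
import math


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg equilibrium in controls.

    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def model_tables(cases, controls):
    """Build 2x2 tables (a, b, c, d) for the three genetic models.

    cases, controls: (risk homozygote, heterozygote, non-risk homozygote).
    """
    rr1, rh1, nn1 = cases
    rr0, rh0, nn0 = controls
    return {
        # Allele contrast: risk alleles vs. non-risk alleles.
        "AC": (2 * rr1 + rh1, 2 * nn1 + rh1, 2 * rr0 + rh0, 2 * nn0 + rh0),
        # Dominant model: carriers of the risk allele vs. non-carriers.
        "DM": (rr1 + rh1, nn1, rr0 + rh0, nn0),
        # Recessive model: risk homozygotes vs. all other genotypes.
        "RM": (rr1, rh1 + nn1, rr0, rh0 + nn0),
    }


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical genotype counts, for illustration only.
tables = model_tables(cases=(30, 50, 20), controls=(20, 50, 30))
for name, (a, b, c, d) in tables.items():
    print(name, round((a * d) / (b * c), 3))
print(hwe_chi2(20, 50, 30))
print(bh_adjust([0.01, 0.04, 0.03]))
```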
Sub-population analyses were conducted for ethnicity [16], and subgroup analyses for IS subtype, age, or sex (if available) were also performed [17]. A sensitivity analysis was performed with the exclusion of specific studies [18], such as poor-quality studies (NOS <7) or studies where no ANRIL genetic variants were found in either cases or controls. All statistical analyses were performed with the Cochrane Review Manager (RevMan, version 5.4) and STATA 16.0 package. A probability value of p<0.05 (two-tailed) was considered significant unless indicated otherwise.
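The sensitivity analysis described above amounts to re-pooling after dropping selected studies and checking whether the estimate moves. A minimal sketch follows, assuming a fixed-effects inverse-variance pool of per-study log ORs; the dictionary keys, function names, and study values are hypothetical and are not taken from the included studies.

```python
import math


def pooled_or_fixed(studies):
    """Inverse-variance fixed-effects pooled OR from per-study log ORs."""
    weights = [1 / s["se"] ** 2 for s in studies]
    pooled = sum(w * s["log_or"] for w, s in zip(weights, studies)) / sum(weights)
    return math.exp(pooled)


def sensitivity_by_quality(studies, min_nos=7):
    """Compare the pooled OR for all studies with that for NOS >= min_nos."""
    kept = [s for s in studies if s["nos"] >= min_nos]
    return pooled_or_fixed(studies), pooled_or_fixed(kept)


def leave_one_out(studies):
    """Pooled OR after omitting each study in turn, keyed by study name."""
    return {
        s["name"]: pooled_or_fixed([t for t in studies if t is not s])
        for s in studies
    }


# Hypothetical studies, for illustration only.
studies = [
    {"name": "Study A", "log_or": 0.18, "se": 0.08, "nos": 8},
    {"name": "Study B", "log_or": 0.35, "se": 0.20, "nos": 6},
    {"name": "Study C", "log_or": 0.10, "se": 0.10, "nos": 7},
]
print(sensitivity_by_quality(studies))
print(leave_one_out(studies))
```

If the pooled OR and its direction are similar with and without the excluded studies, the result is usually described as stable.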

Study selection and characteristics of eligible datasets
We found 856 records by primary searches in the databases and six additional records were identified from other sources, including 113 articles from English-language databases and 749 items from Chinese-language databases. Initially, 115 potentially relevant articles (16 in Chinese and 99 in English) were selected after reading the titles and abstracts. After reading the full text of these articles, 90 articles were excluded because of duplicates, reviews, mixed samples (transient ischemic attack or hemorrhagic stroke were not excluded), insufficient data, irrelevant content, genetic variants beyond the scope of this study, or ineligible study design. Finally, 25 articles (2 in Chinese and 23 in English) involving 15 SNPs (rs2383207, rs10757274, rs10757278, rs2383206, rs1333040, rs1333049, rs1537378, rs4977574, rs1004638, rs7865618, rs10965227, rs1333042, rs7044859, rs10116277, and rs10757269) were found to be eligible for the meta-analysis after applying all the inclusion and exclusion criteria described above. The results of the systematic literature search and article selection are summarized in Fig 1. The excluded articles and the reasons for excluding each article are given in S3 Appendix.
Twenty-three of the included articles were full-length reports published in peer-reviewed journals [19-27, 29-33, 35-43], and two were master's degree theses [28,34]. The characteristics of these studies and the ANRIL polymorphisms involved in the meta-analysis are summarized in Table 1. A summary of the total number of studies on different ANRIL SNPs is provided in Table 2.

Genetic association of 15 ANRIL SNPs with IS
SNP rs2383207. The association of rs2383207 with IS risk was investigated in 12 studies [21, 23, 28-30, 32, 35, 38-41, 43] involving 11,527 cases and 12,216 controls. No significant association of rs2383207 with IS was found under the three genetic models in the whole studied population, sub-populations, or IS subtypes. High heterogeneity was detected in the whole studied population (AC: I² = 82%, p <0.001; DM: I² = 71.6%, p <0.001; RM: I² = 74.5%, p <0.001) and in the large-artery atherosclerosis (LAA) subtype (AC: I² = 85.7%, p <0.001; DM: I² = 77.6%, p <0.001; RM: I² = 76.9%, p <0.001) with all three models; however, the heterogeneity disappeared when the Caucasian studies were excluded, suggesting that ethnicity (Caucasian) may be the source of heterogeneity. Meta-regression analysis to identify different sources of heterogeneity indicated that ethnicity may be linked to heterogeneity (p = 0.085), but this finding was not statistically significant. The sensitivity analysis excluding the poor-quality studies [29, 32, 35, 39] gave similar overall results, confirming that the results were stable and reliable. We did not find publication bias for this SNP using the funnel plots and Egger's test (p = 0.167 in the allelic comparison model).
In the IS subtype analyses, the G allele and GG genotype conferred susceptibility to LAA in the whole studied population (G allele: OR = 1.18, 95%CI: 1.08-1.30, p-FDR <0.001; GG genotype: OR = 1.31, 95%CI: 1.13-1.52, p-FDR <0.001), but mainly in Asians (G allele: OR = 1.18, 95%CI: 1.06-1.31, p-FDR = 0.003; GG genotype: OR = 1.33, 95%CI: 1.12-1.57, p-FDR = 0.003). In contrast, the AA genotype had a protective role in LAA only in the whole studied population (OR = 0.84, 95%CI: 0.73-0.96, p-FDR = 0.014). Sex had no effect in any of the comparisons. Significant heterogeneity among studies was detected only in the recessive model (GG/(AA+AG)) in the whole studied population (I² = 54.8%, p = 0.018) and in the Caucasian studies (I² = 78.3%, p = 0.003). The heterogeneity disappeared in the whole studied population (I² = 40%, p = 0.11) and in Caucasians (I² = 47%, p = 0.15) after excluding the study by Yamagishi et al. [25]. The sensitivity analyses after removing the one study with NOS <7 [26] did not alter the final results in any of the genetic comparisons in the whole studied population or in Caucasians, further confirming the reliability of the results. No significant publication bias was detected in any of the three genetic models.
SNP rs10757278. The role of rs10757278 in IS was analyzed in 10 studies [20,23,24,27,28,30,35,36,40,42] involving 9,352 cases and 2, 4552 controls. A positive association was found in the whole studied population, and in Asians and Caucasians with IS using the combined results. The G allele and GG genotype increased the susceptibility to IS in the whole studied population (G allele: OR = 1.  No heterogeneity was detected in any of the comparisons for IS subtypes. Additionally, no age difference was found in the three genetic models. The sensitivity analyses excluding the low-quality studies (NOS <7) [20,35] did not affect the stability of the original results.
We found a publication bias in the allelic comparison in the whole studied population (p = 0.019, Egger's test) (Fig 5), indicating that more studies are needed to verify the conclusion.
SNP rs2383206. The role of rs2383206 in IS was investigated in nine studies involving 4,431 cases and 8,423 controls [19, 22-25, 28, 30, 35, 40]. The G allele and GG genotype The combined results showed that the TT genotype conferred increased risk (OR = 1.09, 95%CI: 1.00-1.19, p-FDR = 0.044) (Fig 8), and the C allele or CC genotype played a protective role in IS in the whole studied population (C allele: OR = 0.92, 95%CI: 0.88-0.97, p-FDR = 0.003; CC genotype: OR = 0.83, 95%CI: 0.73-0.94, p-FDR = 0.006). In contrast, in the sub-population analyses, the C allele showed a protective effect on IS, but only in Caucasians (OR = 0.92, 95%CI: 0.86-0.98, p-FDR = 0.018).
No significant relationship of rs1333040 with LAA was found in the whole studied population; however, an association with LAA risk was found in Caucasians. Patients with the C allele and CC genotype had a lower possibility of developing LAA (C allele: OR = 0.86, 95%CI: 0.76- In contrast, patients with the TT genotype seemed to be more predisposed to LAA risk (OR = 1.20, 95%CI: 1.01-1.42, p-FDR = 0.037). No sex difference was found for IS in any of the comparisons. There was no significant heterogeneity among the studies.
The sensitivity analyses after excluding low-quality studies (NOS <7) [29, 32] did not alter the final results. No publication bias was detected in the three genetic models in the whole studied population (Egger's test for AC p = 0.772, for DM p = 0.502, for RM p = 0.875).
SNP rs1333049. The role of rs1333049 in IS was analyzed in seven studies involving 5,351 cases and 6,061 controls [21,23,28,29,35,39,40]. Pooled analyses showed that the C allele increased the susceptibility to IS (OR = 1.09, 95%CI: 1.03-1.15, p-FDR = 0.009) in the whole studied population (Fig 9) and in Caucasians (OR = 1.15, 95%CI: 1.06-1.24, p-FDR = 0.001). No significant association was found in Asians, LAA subtype, or age subgroup (<45 vs. ≥45 years old). No heterogeneity was detected in any of the genetic comparisons.
The results remained unchanged in the three models in the whole studied population in the sensitivity analyses after removing low-quality studies (NOS <7) [29, 35, 39]. No publication bias was found in the three genetic models (Egger's test for AC p = 0.845, for DM p = 0.854, for RM p = 0.187).
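The Egger's test results quoted for each SNP come from a regression of the standardized effect on precision, where an intercept far from zero signals funnel-plot asymmetry. The sketch below is a minimal version of that regression under stated assumptions: it returns the intercept, slope, and t statistic (with k - 2 degrees of freedom) but does not compute the p-value, and the function name and inputs are hypothetical rather than the STATA routine used in the study.

```python
import math


def egger_regression(log_effects, standard_errors):
    """Egger's regression of standardized effect on precision.

    Returns (intercept, slope, t_statistic). The intercept measures
    funnel-plot asymmetry; t has len(log_effects) - 2 degrees of freedom.
    """
    k = len(log_effects)
    if k < 3:
        raise ValueError("Egger's regression needs at least three studies")

    x = [1 / se for se in standard_errors]                       # precision
    z = [y / se for y, se in zip(log_effects, standard_errors)]  # standardized effect

    x_bar = sum(x) / k
    z_bar = sum(z) / k
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be equal")
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))

    # Ordinary least squares fit of z = intercept + slope * x.
    slope = sxz / sxx
    intercept = z_bar - slope * x_bar

    # Standard error of the intercept from the residual variance.
    residual_ss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    s2 = residual_ss / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, t_stat


# Hypothetical log ORs and standard errors, for illustration only.
print(egger_regression([0.25, 0.10, 0.30, 0.05, 0.40], [0.20, 0.08, 0.25, 0.06, 0.30]))
```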
SNP rs1537378. The role of rs1537378 in IS was analyzed in six studies [23,27,28,35,36,40] involving 6,166 cases and 6,129 controls. The CC genotype was found to increase the risk for IS in the whole studied population (OR = 1.18, 95%CI: 1.09-1.27, p-FDR <0.001) (Fig 10). In the IS subtype analyses, a significant relationship was found only in LAA. The LAA risk was higher in carriers of the CC genotype, and patients carrying the T allele and TT genotype had lower risk for LAA in the whole studied population, in Asians, and in Caucasians. In patients who were ≥45 years old, the CC genotype was also associated with higher risk for all types of IS, and only the T allele had a protective role.
Significant heterogeneity among studies was found in the T allele (T/C) and CC genotype comparisons (CC vs. (CT+TT)) only in the whole studied population. The heterogeneity disappeared after removing the study by Bi et al. [36], which suggested it may be a source of heterogeneity; however, the final results remained unchanged. The sensitivity analyses after excluding the study with NOS = 6 [35] did not alter any of the results, indicating the reliability and stability of the original results. The funnel plot was asymmetric in all three genetic comparisons in the whole studied population (Egger's test for AC p = 0.019; for DM p = 0.033; for RM p = 0.046) (Fig 11), which suggested there might be some publication bias. The trim and fill method was used to identify and correct the bias, and the combined effect was found to be unchanged, indicating that the possible publication bias had little effect on the results.

PLOS ONE
SNP rs4977574. The role of rs4977574 in IS was analyzed in five studies [32, 33, 35, 37, 43] involving 6,083 cases and 4,593 controls that included three Asian, one Caucasian, and one mixed populations.
The IS subtype analysis showed that the G allele and GG genotype were risk factors for LAA in the whole studied population (G allele: OR = 1.  The sensitivity analysis after omitting two poor-quality studies [32,35] showed that the final pooled results were not affected. No publication bias was detected by the funnel plots or Egger's test in the three genetic models in the whole studied population.
Other SNPs. For each of the remaining six SNPs (rs7865618, rs10965227, rs1333042, rs7044859, rs10116277, and rs10757269), only two studies, with 512 to 4,322 cases and 752 to 4,477 controls, were included in the meta-analyses. No significant association was found in any of the comparisons. Heterogeneity between studies, sensitivity analysis, and publication bias were not explored because of the small number of studies for each SNP.

Discussion
The meta-analysis results showed that eight SNPs (rs10757274, rs10757278, rs2383206, rs1333040, rs1333049, rs1537378, rs4977574, and rs1004638) in ANRIL were significantly associated with IS risk, and six of these SNPs (rs10757274, rs10757278, rs2383206, rs1333040, rs1537378, and rs4977574) were also found to be related to the LAA subtype of IS. Two of the SNPs (rs2383206 and rs4977574) were associated with IS mainly in Asians, and three SNPs (rs10757274, rs1333040, and rs1333049) were associated with susceptibility to IS mainly in Caucasians.
The locus close to a cluster of cell-cycle regulating genes in chromosome 9p21, such as CDKN2A and CDKN2B, regulates vascular remodeling pathways. The proteins encoded by these genes affect cell-cycle progression, resulting in an antiproliferative effect on arterial smooth muscle. In human white blood cells, the homozygous carriers of the 9p21 risk allele are associated with down-regulation of CDKN2B expression and up-regulation of genes involved in cellular proliferation. Markedly decreased expression of CDKN2A and CDKN2B was reported in mutant mice and doubling of the proliferative capacity of mutant aortic smooth muscle cells in culture was detected, a cellular phenotype relevant to atherosclerosis [44].
ANRIL encodes a large antisense long non-coding RNA in which the first exon is located in the CDKN2A promoter and overlaps with the two exons of CDKN2B. Expression of ANRIL co-clustered mainly with p14/ARF under both physiologic and pathologic conditions. The 9p21 region may promote atherosclerosis by regulating the expression of ANRIL, which in turn is associated with altered expression of genes that control cellular proliferation pathways [9]. ANRIL was recently shown to be expressed in human atheromatous vessels, including both abdominal aortic aneurysm and carotid endarterectomy samples, as well as in isolated vascular endothelial cells, monocyte-derived macrophages, and coronary smooth muscle cells. Moreover, ANRIL expression was significantly associated with the alteration in function of vascular endothelial cells and vascular smooth muscle cells in both human and animal models [45]. Together, these findings indicate that ANRIL has a direct effect on the pathobiology of atherosclerosis. Therefore, ANRIL is considered a good candidate for atherosclerotic disease risk, such as coronary artery disease (CAD) and IS [46,47].
Studies have shown that different ANRIL transcripts exhibit disease-specific expression patterns in CAD, which further supports the hypothesis that ANRIL is the causative gene at the 9p21 CAD susceptibility locus [48]. Recently, a few meta-analyses using SNPs also indicated a significant association of ANRIL with CAD [49-55]. IS is known to share common pathophysiological mechanisms with CAD, and CAD and IS seem to have common susceptibility loci. A comprehensive review indicated that increased ANRIL expression was associated with IS risk in animal models by promoting angiogenesis and regulating inflammation [56], and patients with IS were also found to have significantly higher serum ANRIL levels in clinical practice [57,58].
Some studies have explored the functional effect of SNPs in ANRIL. The rs1333049 risk allele (C allele) was found to influence ANRIL expression levels in vascular smooth muscle cells, which was associated with elevated levels of these cells in atherosclerotic plaques involved in the pathogenesis of atherosclerosis [59]. Rs1333040 is located in an intronic enhancer region that was found to influence the activity of the enhancer and ANRIL expression. Rs10757274 showed high linkage disequilibrium with myocardial infarction-associated SNPs, including rs1537373, rs4977575, and rs10757272, and contributed to the activation or inhibition of the expression of the related genes [55]. A few SNPs were found to have a significant relationship to vascular risk factors. Patients carrying mutant alleles of rs1333049 and rs4977574 had elevated total cholesterol, triglycerides, and low-density lipoprotein cholesterol levels [60-62]. The risk allele of rs4977574 was also found to be related to carotid plaque formation in patients with acute IS [63] or type 2 diabetes [64]. All of these factors may lead to the progression of atherosclerotic vascular diseases or IS.
A few meta-analyses have reported the association of ANRIL with IS; however, these meta-analyses have some limitations, such as failure to include all eligible studies [34,43,52,65-68], no comprehensive analyses [66-68], confounding cases (patients with transient ischemic attack or other types of stroke were included in the IS samples) [34,52,65,67], as well as wrong SNP loci [65] or errors in extracting and analyzing data [34,65,67], which could have influenced the overall results. Two previous genome-wide association studies (GWAS) [69,70] explored the relationship of ANRIL SNPs and IS in a Caucasian cohort with European ancestry, but only one SNP (rs2383207) was found to be associated with LAA. Ethnicity may partly explain the discrepancy between the GWAS results and the results of the present meta-analysis, which included more Asians.
The potential biological mechanisms, including how ANRIL is strongly associated with the risk for cardio-metabolic diseases, are still unknown. Recent reports have found that the N4-acetylcytidine modification of RNA, which regulates gene expression, as well as microRNA-mediated gene expression and immuno-deficiency in the gut microbiome, were key to cardio-metabolic diseases, including IS [71-78]. However, the few studies that have investigated the role of ANRIL SNP loci in the N4-acetylcytidine regulatory pathway failed to find definite effects of RNA modification or immuno-deficiency on the development of IS.
Our meta-analysis has some limitations. Firstly, there is language bias because we only searched studies of ANRIL polymorphisms on IS reported in Chinese and English, and therefore may have missed studies published in other languages. Secondly, the number of studies included in this meta-analysis was moderate, and seven of the SNPs (rs1004638, rs7865618, rs10965227, rs1333042, rs7044859, rs10116277, and rs10757269) were involved in three or fewer studies. Therefore, some results could be influenced by random error and/or publication bias. Thirdly, potential confounders between studies or between cases and controls within each study, such as age, sex, or ethnic admixture, were not adjusted for, which may have influenced the results. Fourthly, it is well known that it is very important to conduct causal inference analysis to determine if the associated genetic polymorphisms are causally triggering the development of IS by mediating the expression of this gene in specific tissues [79-82]. Although this meta-analysis aimed to discuss the association of ANRIL with IS using SNPs as genetic markers, no causal genetic effects of ANRIL on IS can be established.
Fifthly, machine learning is considered a useful tool for the classification and prediction of diseases based on biomarkers [83-86], but we have not yet used it to analyze the role of ANRIL in susceptibility to IS. Sixthly, GWAS, case-only studies, and family-based studies were not included because of differences in study design, but they could be useful for meta-analysis in the future. Finally, the inter-study heterogeneity in the pooled analyses may have affected the results for several SNPs.
In summary, our accumulated pooled analyses indicate that ANRIL has a significant association with IS risk in Asian populations. The causal effects of the ANRIL SNPs associated with IS can be explored by Mendelian randomization analysis in the future.